fix: prevent zombie instances on app restart #23757

Merged: Jasonnnz merged 3 commits into feature/fix-restart-zombie from swarm/fix-restart-zombie/task-1 on Apr 6, 2026
Jasonnnz (Contributor) commented Apr 6, 2026

Summary

  • Disconnect connectionManager before daemon stop in performRestart() to prevent autoWakeIfAssistantDied() from fighting the shutdown (same pattern as performRetireAsync())
  • Add isRestarting flag so applicationShouldTerminate returns .terminateNow, skipping the redundant second cli.stop() and fragile async MainActor.run dispatch that could leave the process as a zombie
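
The control flow described in the two bullets above can be sketched in isolation. This is a minimal model under stated assumptions, not the code in this PR: `TerminateReply` stands in for `NSApplication.TerminateReply` so the sketch runs without AppKit, `RestartModel` and its `events` log are hypothetical, the relaunch outcome is passed in as a plain flag, and the daemon is assumed to be stopped before the relaunch attempt. Setting `isRestarting` only on the success path and reconnecting on failure follow the later commits listed in this PR.

```swift
// Hypothetical stand-in for NSApplication.TerminateReply (no AppKit needed).
enum TerminateReply {
    case terminateNow
    case terminateLater
}

// Hypothetical model of the delegate's restart state. Real calls are
// replaced by entries in `events` so the ordering can be inspected.
final class RestartModel {
    private(set) var isRestarting = false
    private(set) var events: [String] = []

    func performRestart(relaunchSucceeds: Bool) {
        // Disconnect first so nothing tries to auto-wake the daemon
        // while it is being stopped.
        events.append("connectionManager.disconnect")
        events.append("cli.stop")

        if relaunchSucceeds {
            // Flag is set only once the relaunch is known to have worked.
            isRestarting = true
            events.append("terminate")
        } else {
            // Failed relaunch: leave the flag false and reconnect.
            events.append("connectionManager.connect")
        }
    }

    func applicationShouldTerminate() -> TerminateReply {
        if isRestarting {
            // The daemon was already stopped by performRestart().
            return .terminateNow
        }
        // Normal quit: stop the daemon, then finish termination later.
        events.append("cli.stop")
        return .terminateLater
    }
}
```

In this model, a successful `performRestart(relaunchSucceeds: true)` followed by `applicationShouldTerminate()` yields `.terminateNow` with a single `cli.stop` entry, while a failed relaunch or a plain quit still goes through the `.terminateLater` path.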

Test plan

  • Click "Restart" from the menu bar and verify the old app instance terminates cleanly (no zombie process, no lingering menu bar icon)
  • Verify the new app instance launches and connects successfully after restart
  • Verify normal quit (Cmd+Q) still works correctly via the .terminateLater path

Fixes #23756.

🤖 Generated with Claude Code


Disconnect connectionManager before daemon stop in performRestart() to
prevent autoWakeIfAssistantDied() from fighting with the shutdown. Add
isRestarting flag so applicationShouldTerminate returns .terminateNow,
skipping the redundant second cli.stop() and fragile async MainActor.run
dispatch that could leave the process as a zombie.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Jasonnnz Jasonnnz self-assigned this Apr 6, 2026
chatgpt-codex-connector[bot]

This comment was marked as resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Jasonnnz (Contributor, Author) commented Apr 6, 2026

@codex review this PR again — the previous issues have been fixed in commit 80716e2

Jasonnnz (Contributor, Author) commented Apr 6, 2026

@devin review this PR again — the previous issues have been fixed in commit 80716e2

chatgpt-codex-connector[bot]

This comment was marked as resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Jasonnnz (Contributor, Author) commented Apr 6, 2026

@codex review this PR again — the previous issues have been fixed in commit 4becbb7

Jasonnnz (Contributor, Author) commented Apr 6, 2026

@devin review this PR again — the previous issues have been fixed in commit 4becbb7

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.

@Jasonnnz Jasonnnz merged commit 81f54cf into feature/fix-restart-zombie Apr 6, 2026
6 checks passed
@Jasonnnz Jasonnnz deleted the swarm/fix-restart-zombie/task-1 branch April 6, 2026 15:21

devin-ai-integration (Bot, Contributor) left a comment

Devin Review found 1 new potential issue.

View 3 additional findings in Devin Review.

Comment on lines +176 to +182
self?.isRestarting = false
// Reconnect SSE and health checks so the app doesn't stay
// in a disconnected state after a failed relaunch attempt.
// (Same pattern as performRetireAsync()'s cancel path.)
Task { @MainActor [weak self] in
    try? await self?.connectionManager.connect()
}

🔴 isRestarting = false written off the main actor in NSWorkspace completion handler

AppDelegate is @MainActor, so isRestarting is main-actor-isolated. However, in the error path of performRestart(), the NSWorkspace.shared.openApplication completion handler runs on a background queue (not the main actor). The assignment self?.isRestarting = false at line 176 writes this main-actor-isolated property from a non-isolated context, which is a data race.

Because the write is unsynchronized, it is not guaranteed to become visible to the main actor; the weak memory model on Apple Silicon (ARM64) makes this more likely to matter in practice. If the restart fails and isRestarting remains stale true, a subsequent app quit triggers applicationShouldTerminate (AppDelegate.swift:753), which reads isRestarting, sees true, and returns .terminateNow, bypassing the vellumCli.stop() cleanup. The reconnection on lines 180-182 is correctly wrapped in Task { @MainActor ... }, but the flag reset is not.

Suggested change
- self?.isRestarting = false
- // Reconnect SSE and health checks so the app doesn't stay
- // in a disconnected state after a failed relaunch attempt.
- // (Same pattern as performRetireAsync()'s cancel path.)
- Task { @MainActor [weak self] in
-     try? await self?.connectionManager.connect()
- }
+ Task { @MainActor [weak self] in
+     self?.isRestarting = false
+     // Reconnect SSE and health checks so the app doesn't stay
+     // in a disconnected state after a failed relaunch attempt.
+     // (Same pattern as performRetireAsync()'s cancel path.)
+     try? await self?.connectionManager.connect()
+ }
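
The main-actor hop that this suggestion relies on can be tried outside AppKit. The following is a minimal sketch under stated assumptions, not the PR's code: `relaunch(completion:)` is a hypothetical stand-in for an API such as `NSWorkspace.shared.openApplication` that calls back on a background queue, `RestartState` is a hypothetical holder used in place of `AppDelegate`, and the flag is assumed to be set before the relaunch attempt, as in the snippet under review.

```swift
import Dispatch

// Hypothetical holder for the flag; stands in for the @MainActor AppDelegate.
@MainActor
final class RestartState {
    var isRestarting = false
}

// Hypothetical relaunch API: always fails, and reports the failure
// from a background queue rather than the main actor.
func relaunch(completion: @escaping @Sendable (Bool) -> Void) {
    DispatchQueue.global().async {
        completion(false)
    }
}

@MainActor
func attemptRelaunch(state: RestartState) async {
    state.isRestarting = true
    await withCheckedContinuation { (done: CheckedContinuation<Void, Never>) in
        relaunch { succeeded in
            // This closure runs off the main actor, so it must not touch
            // `state` directly. Hop back before resetting the flag.
            Task { @MainActor in
                if !succeeded {
                    state.isRestarting = false
                }
                done.resume()
            }
        }
    }
}
```

Awaiting `attemptRelaunch(state:)` from main-actor code returns only after the flag reset has run on the main actor, so a later read of `isRestarting` on that actor observes `false`. Under strict concurrency checking, writing `state.isRestarting` directly inside the `relaunch` callback would be flagged by the compiler instead of racing silently.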
Jasonnnz added a commit that referenced this pull request Apr 6, 2026
* fix: prevent zombie instances on app restart (#23757)

* fix: prevent zombie instances on app restart

Disconnect connectionManager before daemon stop in performRestart() to
prevent autoWakeIfAssistantDied() from fighting with the shutdown. Add
isRestarting flag so applicationShouldTerminate returns .terminateNow,
skipping the redundant second cli.stop() and fragile async MainActor.run
dispatch that could leave the process as a zombie.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: reset isRestarting flag on relaunch failure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: reconnect connectionManager on restart relaunch failure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: move isRestarting flag to success path to avoid premature terminateNow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>